Abstract

We introduce a search problem called "mutual search" where $k$ agents, arbitrarily distributed
over $n$ sites, are required to locate one another by posing queries of the form "Anybody at site
$i$?". We ask for the least number of queries that is necessary and sufficient. For the case of two
agents using deterministic protocols we obtain the following worst-case results. In an oblivious
setting (where all pre-planned queries are executed) there are no savings: $n-1$ queries are required
and are sufficient. In a nonoblivious setting we can exploit the paradigm of "no news is also
news" to obtain significant savings: in the synchronous case $0.586n$ queries suffice and $0.536n$
queries are required; in the asynchronous case $0.896n$ queries suffice and a fortiori $0.536n$ queries
are required. For $o(\sqrt{n})$ agents using a deterministic protocol, fewer than $n$ queries suffice. There
is a simple randomized protocol for two agents with worst-case expected $0.5n$ queries, and all
randomized protocols require at least $0.125n$ worst-case expected queries. The graph-theoretic
framework we formulate for expressing and analyzing algorithms for this problem may be of
independent interest.

Search problems come in many forms [11]. Perhaps the following one is new: Suppose you and a
friend check separately into the same hotel in Las Vegas, in different rooms. For reasons we won't go
into here, neither of you wants to draw any attention to your relationship. You are supposed to phone
one another at noon, but unfortunately you don't know each other's room number. What to do?
Every room contains a room phone and room number. You can phone all other rooms in the hotel
to find your friend and she can do the same (if the wrong person picks up the phone you simply
hang up and nobody is the wiser). This will cost a lot of time: there are 1000 rooms. In the worst
case you use almost 2000 room calls together. Luckily you and your friend know the protocol in
this paper: you locate one another using only 586 room calls together in the worst case. There are
more serious problems of the same nature that are listed in the Appendix.

In general we can think of $k \ge 2$ agents having to find each other's locations in a uniform
unstructured search space consisting of $n$ sites ($n \ge k$). The sites have distinct identities, say
$0, \ldots, n-1$, every site can contain zero or one agent, and the agents execute identical
protocols based on the values $n, k$ with their site identity as input. The agents can execute queries
of the form "Anybody at site $i$?" and every such query results in an answer "yes" or "no." We say
that two agents know each other's location as soon as one agent queries the location of the other
agent or the other way around. Before that happens they don't know each other's location. The
relation "know location" is transitive and the problem is solved if all $k$ agents know one another's
location. This type of search can be called Mutual Search (MS). We analyze the cost in number
of queries for the case $k = 2$ under various timing assumptions for deterministic and randomized
algorithms. We also give a result for the general case of $k = o(\sqrt{n})$ agents.

Our Results: We first look at deterministic protocols for two agents. If the protocol is
oblivious, so that the cost for each agent is a fixed number of queries, then there are no significant
savings possible: two agents need to place at least $n-1$ queries in total in the worst case.^ Namely,
given a protocol, construct the directed graph on $\{0, \ldots, n-1\}$ with an arc from $i$ to $j$ if an agent
at $i$ queries node $j$. For every pair there must be at least one arc. Hence there are at least $\binom{n}{2}$
arcs in total, and the average number of outgoing arcs per node is at least $\frac{n-1}{2}$. It follows that
some pair of nodes must together have twice this number, or $n-1$, of outgoing edges (this can be
refined to $2\lceil \frac{n-1}{2} \rceil$ for all $n > 2$). The tightness of this bound is witnessed by an algorithm called
HalfInTurn, to be discussed in Section 2.2:

Oblivious case ($k = 2$): Both upper bound and lower bound are $2\lceil \frac{n-1}{2} \rceil$ queries in the
deterministic worst case.

In the remainder of the paper we analyze the nonoblivious case. We obtain savings by exploiting 
the information inherent in timing ("no news is also news") and a prescribed order of events. 



Synchronous case ($k = 2$): In Section 2 we present the protocol SR_n, an algorithm with a
worst-case cost of only $(2 - \sqrt{2})n \approx 0.586n$. We also show this algorithm to be close to optimal,
by proving a $(4 - 2\sqrt{3})n \approx 0.536n$ lower bound on the number of queries required by any mutual
search algorithm in Section 2.4.

Asynchronous case ($k = 2$): In Section 3 we show that there is a mutual search algorithm that
uses asymptotically $0.896n$ queries. The best lower bound we know of is the $0.536n$ lower bound
in Section 2.4.



Randomized case ($k = 2$): We consider randomized algorithms for the problem in Section 4. A
synchronous randomized algorithm is shown to have a worst-case (over agent location) expected
(over random coin flips) cost of about $\frac{n}{2}$, thus beating the deterministic lower bound. We show
a lower bound on the worst-case expected number of queries of $\frac{n-1}{8}$.

^ "Oblivious" means that the queries scheduled at certain time slots take place independent of the replies received.
The same lower bound holds if there is no FIFO discipline: the answer to a query can arrive only after the following
queries are executed. This is the case when the sites are nodes in a computer network and the agents are processes at
those nodes that query by sending messages over communication links with unknown bounded (or possibly unbounded,
as in the FLP model) communication delay without waiting for the answers to earlier messages. Then, a process
may have to send all its messages before an affirmative reply to one of the early messages is received.

Synchronous multi-agent case ($k = o(\sqrt{n})$): In a later section we present RS, a deterministic
algorithm for $k > 2$ agents with a cost well below $n$ for all $k = o(\sqrt{n})$.

The framework we develop for reasoning about the Mutual Search problem may be of inde- 
pendent interest. Mutual search can serve as a preliminary stage to sharing random resources in 
a distributed setting or forming coalitions for Byzantine attacks and various cryptographic settings. 

Related Work: The authors believe that this is a novel type of search problem that has not been 
considered before. We do not know of any directly related previous research. Several topics that 
are more or less related can be found in the Appendix.

2 Synchronous Case For Two Agents 



In Section 2.1 we formulate the model for the synchronous case with $k = 2$ agents located at $n$ sites
and give a framework for expressing and analyzing the structure of algorithms for this problem. We 
analyze this case fairly completely, but in later sections we also present results for other instances 
of the MS problem. 

2.1 Model and Definitions 

Consider $n$ sites numbered $0, \ldots, n-1$ with $k \le n$ agents distributed over the $n$ sites with zero
or one agent per site. Time is discrete, with time slots numbered $0, 1, \ldots$. An agent at site $i$ can
perform queries of the form $q = q(i,j)$ with the following semantics: if site $j$ contains an agent then
the associated answer from site $j$ to the agent at $i$ is 1 (yes), otherwise 0 (no), $0 \le i \ne j \le n-1$. For
definitional reasons we also require an empty query $\perp$ (skipped query) with an empty associated
answer (skipped answer). The query and answer take place in the same time slot. Given the
number $n$ of sites and the number $k$ of agents, a mutual search protocol $A$ consists of a (possibly
randomized) algorithm to produce the sequence of queries an agent at site $i$ ($0 \le i \le n-1$)
executes together with the time instants it executes them: $A(i) = q_0, \ldots, q_m$ where $q_t$ is the query
executed at the $t$th time slot, $t := 0, \ldots, m$. If $q_t = \perp$ then at the $t$th time slot an agent at site $i$
skips a query. A mutual search execution of $k$ agents located at sites $i_1, \ldots, i_k$ consists of the list
$\mathcal{A} = A(i_1), \ldots, A(i_k)$. We require that in every time slot $t$ there are zero or one queries from this
list that are $\ne \perp$. Hence we can view $\mathcal{A}$ as a total order on all queries by the $k$ agents and $A(i)$ as a
restriction to the entries performed by an agent at site $i$ ($i := i_1, \ldots, i_k$). The cost of an execution
is the number of queries $q_t \ne \perp$ with $t \le t_0$, where $t_0$ is the least index such that the answer to query
$q_{t_0}$ equals 1. That is, we are interested in the number of queries until first contact is made.^ The
(worst-case) cost of a mutual search protocol is the maximum cost of an execution of the protocol.
The worst-case cost of mutual search is the minimum (worst-case) cost taken over all mutual search
protocols.

In this paper we consider the case of $k = 2$ agents unless explicitly stated otherwise. The case
$k > 2$ is open except for the multi-agent result presented in a later section. Informally, a mutual
search protocol specifies, for every site that an agent can find itself in, what to do at every time
slot: either stay idle or query another site as specified by the protocol and receive the reply. Every
time slot harbours at most one query and its answer.^

^ This is the cost of a non-oblivious execution. The cost of an oblivious execution is simply the number of queries $\ne \perp$
occurring in the execution list. This case was already completely analyzed in the Introduction. Therefore, in the remainder
of the paper we only consider non-oblivious executions without stating this every time.

^ We can allow simultaneous queries. If there are $k$ agents, then every time slot can have at most $k$ queries $\ne \perp$.



For every pair of sites such a protocol determines which site will first query the other. After the 
first such query takes place, the execution terminates, so that the latter site need never execute the 
now redundant query of the former site. Every such algorithm implies a tournament: a directed 
graph having a single arc between every pair of nodes. An edge from node i to node j represents 
site $i$ querying site $j$. The different times at which the $\binom{n}{2}$ queries/edges are scheduled induce a
total timing order on the edges. Since the cost of running the algorithm depends only on which 
queries are made before a certain contacting query, this total order by itself captures the essence 
of the timing of queries. 

An algorithm can thus be specified by a tournament (telling us who queries whom) plus a 
separate total timing order on its edges (telling us when). Note that the timing order is completely
unrelated to the orientation of the arcs; the tournament may well be cyclic in the sense of containing
cycles of arcs.^ For us, an ordered tournament is a (tournament, order) pair where the order is a
separate total order on the arcs of the tournament.

Definition 1 An algorithm for MS is an ordered tournament $T = (V, E, \prec)$, where the set of nodes
(sites) is $V = \{0, 1, \ldots, n-1\}$, $E$ is a set of $\binom{n}{2} = \frac{1}{2}n(n-1)$ edges (queries), and $\prec$ is a total
order on $E$. For a node $i$, $E_i$ is the set of outgoing edges from $i$, and is called row $i$. The number
of queries $|E_i|$ is called the length of row $i$.

This way Ei is the set of queries agent i can potentially make. Define the cost of an edge as the 
number of queries that will be made if the two agents happen to reside on its incident nodes. 

Definition 2 The cost $c(T)$ of an algorithm $T$ is the maximum over all edges $e = (i,j)$ of the edge
cost $c(e) = |E_i^{\prec e}| + 1 + |E_j^{\prec e}|$, where for any $F \subseteq E$, $F^{\prec e}$ denotes $\{f \in F : f \prec e\}$.

If the agents are located at nodes $i$ and $j$ and the edge $e$ between them is directed from $i$ to
$j$, then at the time $i$ queries $j$, agent $i$ has made all queries in $E_i^{\prec e}$, while agent $j$ has made all
queries in $E_j^{\prec e}$. We now have all that is needed to present and analyze some basic algorithms for
the problem, which will form the basis of a better algorithm.
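
As a minimal sketch of Definitions 1 and 2, an ordered tournament can be stored as a list of edges in timing order, and the edge costs follow by counting how many queries each endpoint has made so far. The function names and the three-node cyclic example are illustrative assumptions, not part of the formal development.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # (i, j) means "site i queries site j"


def edge_costs(order: List[Edge]) -> Dict[Edge, int]:
    """Edge costs of an ordered tournament given as a time-ordered edge list."""
    made: Dict[int, int] = {}   # number of queries each site has made so far
    costs: Dict[Edge, int] = {}
    for i, j in order:
        # |E_i^{<e}| + 1 + |E_j^{<e}| as in Definition 2
        costs[(i, j)] = made.get(i, 0) + 1 + made.get(j, 0)
        made[i] = made.get(i, 0) + 1
    return costs


def algorithm_cost(order: List[Edge]) -> int:
    """Cost c(T): the maximum edge cost."""
    return max(edge_costs(order).values())


# Hypothetical example: a cyclic tournament on three sites, ordered in time.
CYCLE_3 = [(0, 1), (1, 2), (2, 0)]
print(edge_costs(CYCLE_3), algorithm_cost(CYCLE_3))
```

In this example the edge (2,0) is the most expensive one: by the time site 2 queries site 0, site 0 has already made its own query.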

2.2 Some Simple Mutual Search Algorithms 

The first algorithm, AllInTurn_n, lets each site in turn query all the other sites. For instance,
AllInTurn_4 can be depicted as^

    0:  1 2 3
    1:        2 3
    2:            3
    3:

Here, the sites are shown as labeling the rows of a matrix, whose columns represent successive
time instances. A number $j$ appearing in row $i$ and column $t$ of the matrix represents the query
from $i$ to $j$ scheduled at time $t$. As an example execution, suppose the agents are situated at sites
0 and 2. Then at time 0, (the agent at site) 0 queries 1 and receives reply "no": the second agent
is not there. Next, 0 queries 2 and contacts the second agent, finishing the execution at a cost of 2
queries.
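
The example execution can be replayed with a small simulation of the time-slot model. This is only a sketch: the schedule representation (a list of `(time_slot, target)` pairs per site) and the function names are assumptions chosen for illustration.

```python
from itertools import combinations


def all_in_turn_schedule(n):
    """schedule[i] = list of (time_slot, target); one query per time slot."""
    schedule = {i: [] for i in range(n)}
    t = 0
    for i in range(n):
        for j in range(i + 1, n):
            schedule[i].append((t, j))
            t += 1
    return schedule


def execution_cost(schedule, a, b):
    """Queries made by the agents at sites a and b, up to and including first contact."""
    events = sorted((t, src, dst) for src in (a, b) for (t, dst) in schedule[src])
    for count, (_, src, dst) in enumerate(events, start=1):
        if dst in (a, b):          # a site never queries itself, so dst is the other agent
            return count
    raise ValueError("the two agents never make contact")


def worst_case_cost(schedule):
    """Maximum execution cost over all placements of two agents."""
    return max(execution_cost(schedule, a, b) for a, b in combinations(schedule, 2))


schedule = all_in_turn_schedule(4)
print(execution_cost(schedule, 0, 2))   # agents at sites 0 and 2
print(worst_case_cost(schedule))
```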

The precise cost then depends only on how we account the at most $k$ queries in the time slot containing the first
query with answer 1. Under different conventions the results can only vary by $k - 1$, that is, by only 1 unit for $k = 2$.

^ For example, algorithm HalfInTurn_n below has cycles of arcs.

^ Another way would be to draw the tournament on nodes 0, . . . , 3 and label every arc with a time. The matrix 
representation we use seems more convenient to obtain the results. 



Lemma 1 Algorithm AllInTurn_n has cost $n-1$.

Proof. It is in fact easy to see that $c(i,j) = j - i$. An agent at site $i$ makes this many queries
to contact the other agent at site $j$, and the latter never gets to make any queries. The maximum
value of $j - i$ is $n-1$. □

A somewhat more balanced algorithm is HalfInTurn_n, where each site in turn queries the next
$\lfloor n/2 \rfloor$ sites (modulo $n$). HalfInTurn_5 looks like

    0:  1 2
    1:      2 3
    2:          3 4
    3:              4 0
    4:                  0 1

For even $n$, sites $n/2, \ldots, n-1$ only get to make $\lfloor n/2 \rfloor - 1$ queries.

Lemma 2 Algorithm HalfInTurn_n has cost $n-1$.

Proof. Suppose $i < j$. If $j - i \le \lfloor n/2 \rfloor$, we find $c(i,j) = j - i$; otherwise $(i - j) \bmod n =
n + i - j$, giving $c(j,i) = \lfloor n/2 \rfloor + n + i - j$. Taking $j = i + \lfloor n/2 \rfloor + 1$ achieves the maximum of
$\lfloor n/2 \rfloor + n - (\lfloor n/2 \rfloor + 1) = n - 1$. □
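
Lemma 2 can be checked mechanically for small $n$. The sketch below generates the HalfInTurn_n edge order and evaluates the cost of Definition 2; the function names are illustrative, and the check covers only the listed values of $n$.

```python
def half_in_turn(n):
    """Time-ordered edges: each site in turn queries the next floor(n/2) sites (mod n)."""
    order, seen = [], set()
    for i in range(n):
        for d in range(1, n // 2 + 1):
            j = (i + d) % n
            pair = frozenset((i, j))
            if pair not in seen:      # for even n, the pair at distance n/2 is queried only once
                seen.add(pair)
                order.append((i, j))
    return order


def algorithm_cost(order):
    """Maximum over edges of (queries made so far by both endpoints) + 1."""
    made, worst = {}, 0
    for i, j in order:
        worst = max(worst, made.get(i, 0) + 1 + made.get(j, 0))
        made[i] = made.get(i, 0) + 1
    return worst


for n in range(3, 12):
    print(n, algorithm_cost(half_in_turn(n)))
```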

Our next result shows HalfInTurn_n to be the basis of a much better algorithm.

Definition 3 An algorithm is called saturated if its cost equals its maximum row length. 

An example of a saturated algorithm is AllInTurn_n, whose cost of $n-1$ equals the length of
row 0.

Lemma 3 An algorithm that is not saturated can be extended with another site without increasing
its cost.

Proof. Let T be an algorithm on n nodes whose cost exceeds all row lengths. Add a new node 
n, and an edge from every other node to this new node. Order the new edges after the old edges 
(and arbitrarily amongst each other). This does not affect the cost of the old edges, while the cost 
of edge (i, n) becomes one more than the length of row i, hence not exceeding the old algorithm 
cost. □ 

As the proof shows, the maximum row length increases by exactly one, so we may add as many
sites as the cost exceeds the maximum row length. HalfInTurn_{2k+1} has cost $2k$ and uniform row
length $k$, so we may add $k$ more sites to get a saturated algorithm SaturatedHalfInTurn_{3k+1} of the
same cost:

Corollary 1 Algorithm SaturatedHalfInTurn_n has cost $\lceil \frac{2}{3}(n-1) \rceil$.
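
The construction in the proof of Lemma 3 is easy to run. The following sketch repeatedly appends a new site, queried last by every existing site, until the algorithm is saturated; it assumes the edge-list representation used for Definition 2, and the helper names are hypothetical.

```python
def half_in_turn_odd(k):
    """HalfInTurn on n = 2k+1 sites as a time-ordered edge list, together with n."""
    n = 2 * k + 1
    return [(i, (i + d) % n) for i in range(n) for d in range(1, k + 1)], n


def cost_and_max_row(order):
    """Return (cost, maximum row length) of an ordered tournament."""
    made, worst = {}, 0
    for i, j in order:
        worst = max(worst, made.get(i, 0) + 1 + made.get(j, 0))
        made[i] = made.get(i, 0) + 1
    return worst, max(made.values())


def saturate(order, n):
    """Add sites as in Lemma 3 until the cost equals the maximum row length."""
    order = list(order)
    cost, longest = cost_and_max_row(order)
    while longest < cost:
        order += [(i, n) for i in range(n)]   # every old site queries the new site last
        n += 1
        cost, longest = cost_and_max_row(order)
    return order, n


for k in range(1, 6):
    order, n = saturate(*half_in_turn_odd(k))
    print(k, n, cost_and_max_row(order))
```

For each $k$ tried, the loop stops at $n = 3k+1$ sites with cost $2k$, in line with Corollary 1.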

2.3 Algorithm Refinement 

In order to get a better understanding of the structure of MS algorithms, we need to focus on their 
essential properties. In this section we consider algorithms with only a partial edge ordering. The 
question arises how such a partial ordering can be extended to a good total edge ordering. The 
following terminology helps us answer this question. 



Definition 4 A partial MS algorithm is a partially ordered tournament $T = (V, E, \prec, R)$, where
$R \subseteq E$ is the subset of retired edges, and $\prec$ is now a partial order, which:

• totally orders $R$,

• orders all of $E - R$ before all of $R$, and

• leaves $E - R$ (pairwise) unordered.

An edge $e = (i,j)$ in row prefix $E_i - R$ has retiring cost $c(e) = |E_i - R| + |E_j - R|$. Retiring
an edge $e$ results in a more refined partial algorithm $T' = (V, E, \prec', R')$, where $R' = R \cup \{e\}$ and

$$\prec' \;=\; \prec \,\cup\, (E - R', e).$$

Note that relation $\prec$ is viewed as a set of pairs; $(E - R', e)$ denotes the set $\{(f, e) : f \in E - R'\}$.
The edge $e$ that is added to $R$ was $\prec R$ and, since $\prec'$ extends $\prec$, becomes the new earliest edge in
$R'$.

Algorithm refinement proceeds backward in time: the queries to be made last are scheduled
first. An example partial tournament on four nodes, with 2 retired edges, consists of four mutually
unordered edges, among them (0,3), (0,1) and (2,0), which all precede the retired edges
$(2,3) \prec (3,1)$.

Note that any sequence of \E — R\ refinements yields a (totally ordered) algorithm, which we call 
a total refinement of T. A mere tournament corresponds to a partial algorithm with no retired 
edges. 

Observe that the cost of e in a total refinement depends only on its ordering with respect to 
the edges in rows i and j, which is determined as soon as it retires. This shows the following 

Fact 1 If $T'$ results from $T$ by retiring edge $e = (i,j)$, then the retiring cost of $e$ equals the cost of
that edge in any total refinement of $T'$.



Definition 5 The cost c{T) of a partial algorithm T is the minimal cost among all its total refine- 
ments. A total refinement achieving minimum cost is called optimal. 



Lemma 4 The cost of a partial algorithm T equals the cost of the partial tournament that results 
from retiring the edge e of minimum retiring cost. 

Informally, any refinement from T will have cost at least the minimum retiring cost, and choosing 
e doesn't hamper us in any way. The following proof makes this notion of "non-hampering" precise. 

Proof. Consider an optimal total refinement from $T$ to some algorithm $T''$, in which, at some
point, say after $e_1, e_2, \ldots, e_k$, edge $e$ is retired. Let algorithm $T'$ be the result of retiring $e$ first, and
then continuing the same total refinement with $e$ skipped. Then $T''$ will have $e \prec'' e_k \prec'' \cdots \prec'' e_1$
whereas $T'$ has $e_k \prec' \cdots \prec' e_1 \prec' e$. If we compare the costs $c''$ and $c'$ for any edge in $T''$ and $T'$
respectively, we see that for $1 \le i \le k$, $c'(e_i) \le c''(e_i)$, $c'(e) \ge c''(e)$, and all other edges cost the
same. However, $c'(e) \le c(e_1)$ by assumption, where $c(e_1)$ is the retiring cost of $e_1$ in $T$, which by
Fact 1 equals $c''(e_1)$, and so $T'$ must be optimal too. □



Since optimal refinement is a straightforward greedy procedure that can be performed automatically,
an optimal timed algorithm is determined by just its associated tournament. By
graphically showing the tournament's adjacency matrix, one obtains a visually insightful
representation; for instance, SaturatedHalfInTurn_13 is shown in Figure 1.

Our algorithm HalfInTurn_n now betrays a bad ordering for even $n$. It retires $(n-1, 0)$ first, at
a cost of $n-1$, whereas an optimal refinement can keep the cost down to $n-2$. It takes advantage
of the bottom rows being shorter, and first retires an edge between nodes in this bottom half. For
example, the following reordering of HalfInTurn_4 has cost 2:

    $(0,1) \prec (3,0) \prec (0,2) \prec (1,3) \prec (1,2) \prec (2,3)$
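
The greedy refinement suggested by Lemma 4 can be sketched as follows: starting from a bare tournament, repeatedly retire an edge of minimum retiring cost, and read the timing order off in reverse. The tie-breaking rule (smallest edge in lexicographic order) and the function names are assumptions made for this illustration.

```python
def greedy_refinement(edges):
    """Return (time-ordered edge list, cost) obtained by always retiring a cheapest edge."""
    remaining = {}                       # remaining[i] = |E_i - R|
    for i, j in edges:
        remaining[i] = remaining.get(i, 0) + 1
        remaining.setdefault(j, 0)
    unretired, retired, cost = set(edges), [], 0
    while unretired:
        # retiring cost of e = (i, j) is |E_i - R| + |E_j - R|
        e = min(unretired, key=lambda e: (remaining[e[0]] + remaining[e[1]], e))
        cost = max(cost, remaining[e[0]] + remaining[e[1]])
        unretired.remove(e)
        remaining[e[0]] -= 1
        retired.append(e)
    return retired[::-1], cost           # edges retired first are scheduled last


def ordered_cost(order):
    """Cost of a totally ordered algorithm (Definition 2)."""
    made, worst = {}, 0
    for i, j in order:
        worst = max(worst, made.get(i, 0) + 1 + made.get(j, 0))
        made[i] = made.get(i, 0) + 1
    return worst


half_in_turn_4 = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 0)]
order, cost = greedy_refinement(half_in_turn_4)
print(ordered_cost(half_in_turn_4), order, cost)
```

On the HalfInTurn_4 tournament the original timing has cost 3, while the greedy order and the reordering shown above both have cost 2.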



2.4 Lower Bounds 



Given that the maximum row length is a lower bound on the cost of the algorithm, the following 
result is easily obtained. 

Lemma 5 Every MS algorithm $T$ for $n$ sites has cost at least $\lceil \frac{n}{2} \rceil$.

Proof. The average outdegree of a node in $T$ is $\binom{n}{2}/n = \frac{n-1}{2}$, so some row has length at least
$\lceil \frac{n-1}{2} \rceil = \lfloor \frac{n}{2} \rfloor$. It remains to show that for odd $n$, an algorithm of cost $\frac{n-1}{2}$ is not possible. This
is because for any collection of $n$ rows each of length $\frac{n-1}{2}$, the last edge on every row has retiring
cost $\frac{n-1}{2} + \frac{n-1}{2} = n - 1$. □

The last argument used in the proof shows that the sum length of the shortest two rows is a 
lower bound on an algorithm's cost. An algorithm of cost c thus necessarily has a row of length at 
most c/2. Careful analysis allows us to prove the following generalization: 

Lemma 6 Let $T$ be an MS algorithm for $n$ sites with cost $c$. Then the $(k+1)$st shortest row of $T$
has length at most $c/2 + k$.

Proof. Let $e = (i,j)$ be the last edge for which $i$ and $j$ are not among the shortest $k$ rows.
Consider the moment of $e$'s retirement in the refinement from the unordered tournament in $T$ to
$T$. Since $R$ includes at most $k$ edges from each of the rows $i$ and $j$, the retiring cost of $e$ equals

$$c(e) = |E_i - R| + |E_j - R| \ge |E_i| - k + |E_j| - k \ge 2(\min(|E_i|, |E_j|) - k).$$

Furthermore, $c(e) \le c$, since the cost of $T$ is the maximum of all retirement costs. It follows that
the smallest of rows $i$ and $j$ has length at most $c/2 + k$. □

This shows that in the best possible distribution of row lengths, the $\binom{n}{2}$ entries
are divided over $n - c/2$ rows of maximum length $c$, followed by $c/2$ increasingly shorter rows,
producing a triangular "wasted" space of size about $(c/2)^2/2$.

Theorem 1 Every MS algorithm $T$ for $n$ sites has cost at least $(4 - 2\sqrt{3})(n-1)$ ($\approx 0.536n$).

Proof. Since every row has length at most $c$, Lemma 6 implies:

$$|E| = \frac{n(n-1)}{2} \le nc - \sum_{k=0}^{c/2} (c/2 - k) = nc - \frac{(c/2)(c/2+1)}{2},$$

which in turn implies:

$$(c/2)^2 - 2(n-1)c + (n-1)^2 \le 1 + 1.5c - n \le 0.$$

Solving for $c$, we find $c \ge (4 - 2\sqrt{3})(n-1)$.
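
As a numerical sanity check of the last step, the sketch below evaluates the quadratic $(c/2)^2 - 2(n-1)c + (n-1)^2$ at the claimed root for the hotel example with $n = 1000$ rooms. The function names are illustrative assumptions.

```python
import math


def sync_lower_bound(n: int) -> float:
    """The bound (4 - 2*sqrt(3)) * (n - 1) of Theorem 1."""
    return (4 - 2 * math.sqrt(3)) * (n - 1)


def quadratic(c: float, n: int) -> float:
    """Left-hand side (c/2)^2 - 2(n-1)c + (n-1)^2 from the proof."""
    return (c / 2) ** 2 - 2 * (n - 1) * c + (n - 1) ** 2


n = 1000
root = sync_lower_bound(n)
print(root, quadratic(root, n), quadratic(root - 1, n))
```

The quadratic vanishes (up to rounding) at the bound and is positive just below it, so no smaller $c$ satisfies the inequality.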
2.5 Algorithm "Smooth Retiring"

In this section we present our best algorithm, building on the insights gained in the previous sections.

Algorithm SR_n is not quite as easy to describe as our earlier algorithms. It is best described as
a partial algorithm with ordered rows, an optimal refinement of which will be presented in its cost
analysis.

SR_n divides the nodes into two groups: an upper group $U = \{0, \ldots, u-1\}$ consisting of $u$ nodes
and a lower group $L = \{u, \ldots, n-1\}$ consisting of $c = n - u$ nodes (which is the cost we are aiming
for). As can be expected, construction of SR_n presumes certain conditions on the relative sizes of
$u$ and $c$, which will be derived shortly. The value of $c$ will then be chosen as the smallest which
satisfies the conditions.

The upper group engages in HalfInTurn_u, while the lower group engages in a slight variation
on AllInTurn_c in which each row is reversed.

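Row $u+i$ will have length $c - 1 - \lfloor i/2 \rfloor$, of which $(u+i, n-1), \ldots, (u+i, u+i+1)$ are the last
$n - 1 - (u+i) = c - 1 - i$ edges. That leaves $c - 1 - \lfloor i/2 \rfloor - (c - 1 - i) = \lceil i/2 \rceil$ 'slots' available at the
front of row $u+i$, to be filled with edges to $U$.

Row $i < u$ starts with the $\lceil \frac{u-1}{2} \rceil$ or $\lfloor \frac{u-1}{2} \rfloor$ edges in HalfInTurn_u, leaving up to
$c - \lfloor \frac{u-1}{2} \rfloor$ slots per row to be filled with edges to $L$. The picture so far (with $u = 6$, $c = n - u = 8$) is

     0:   1  2  3  *  *  *  *  *
     1:   2  3  4  *  *  *  *  *
     2:   3  4  5  *  *  *  *  *
     3:   4  5  *  *  *  *  *  *
     4:   5  0  *  *  *  *  *  *
     5:   0  1  *  *  *  *  *  *
     6:  13 12 11 10  9  8  7
     7:   * 13 12 11 10  9  8
     8:   * 13 12 11 10  9
     9:   *  * 13 12 11 10
    10:   *  * 13 12 11
    11:   *  *  * 13 12
    12:   *  *  * 13
    13:   *  *  *  *
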
Asterisks indicate empty slots. By simple geometric properties of the picture we analyze the
requirements. Define block $B_U$ as the elements in the upper $u$ rows determined by $U$ and define
block $B_L$ as the elements in the lower $c = n - u$ rows determined by $L$. The block $B_U$ has $uc$ elements
of which $\binom{u}{2}$ are used for the edges in $U \times U$. There are $uc - \binom{u}{2}$ slots in $B_U$ that can be used for
edges from $U$ to $L$. In the lower block $B_L$ the number of open slots that can be used for edges from
$L$ to $U$ equals $(c-1) + (c-3) + \cdots + 2 = (c^2 - 1)/4$ for odd $c$ and $(c-1) + (c-3) + \cdots + 1 = c^2/4$
for even $c$. That is $\lfloor c^2/4 \rfloor$ open slots. In order to fit all $uc$ edges between $U$ and $L$, the number of
open slots must be sufficient:

$$uc - \binom{u}{2} + \lfloor c^2/4 \rfloor \ge uc.$$

As it happens, the number of elements in $B_U$, that is $uc$, equals the number of edges between $U$
and $L$. Therefore,

$$\lfloor c^2/4 \rfloor \ge \binom{u}{2}. \qquad (1)$$

In the example, the 16 lower slots make up for the 15 which HalfInTurn_6 takes out of the top
section of size $6 \cdot 8 = 48$.
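
The slot-counting requirement is easy to evaluate for concrete $n$. The sketch below searches for the smallest $c$ for which the lower slots cover the $\binom{u}{2}$ entries used by HalfInTurn_u; it also checks the side condition $u \ge \lfloor c/2 \rfloor$ (at most $\lfloor c/2 \rfloor$ slots per lower row must hold distinct targets in $U$). The function name `sr_parameters` is a hypothetical helper, and treating these two conditions as the only constraints is an assumption of this illustration.

```python
from math import comb


def sr_parameters(n: int):
    """Smallest c (with u = n - c) satisfying floor(c^2/4) >= C(u,2) and u >= floor(c/2)."""
    for c in range(1, n):
        u = n - c
        enough_slots = c * c // 4 >= comb(u, 2)   # condition (1)
        distinct_targets = u >= c // 2            # at most floor(c/2) slots per lower row
        if enough_slots and distinct_targets:
            return u, c
    raise ValueError("no feasible split for n = %d" % n)


print(sr_parameters(14))     # the running example with u = 6, c = 8
print(sr_parameters(1000))   # the hotel with 1000 rooms
```

For $n = 1000$ the search returns $c = 586$, the number of room calls quoted in the introduction.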

2.6 Filling in the slots 

The bottom slots are filled in from top to bottom, left to right, modulo u, starting with (u + 1, 0). 
The top slots are then filled with the remaining edges, in reverse order: 
We assume that $u$ is at least the maximum number of slots per row $\lfloor c/2 \rfloor$, to avoid filling a row
twice with the same edge:

$$u \ge \lfloor c/2 \rfloor. \qquad (2)$$




This condition also finds use in the next subsection to show optimality of a certain refinement. 

The tournament underlying this partial algorithm is shown in Figure 2. Figure 3 makes the
pattern clearer with the bigger instance $u = 21$, $c = 29$.



2.7 Cost analysis 

Figure 1: HalfInTurn_13

Figure 2: SR_{6+8}

Figure 3: SR_{21+29}

Theorem 2 Partial algorithm SR_n has cost $c = \lceil (2 - \sqrt{2})(n-1) \rceil$ ($\approx 0.586n$).

Proof. To satisfy condition (1), it suffices to have

$$\frac{c^2}{4} \ge \frac{(n-1-c)^2}{2},$$

or equivalently, $c^2 - 4(n-1)c + 2(n-1)^2 \le 0$, which, solving for $c$, translates to $c \ge (2 - \sqrt{2})(n-1)$.
It remains to show that SR_n actually has cost $c$. This we do by presenting a total refinement
sequence and verifying all retiring costs.

First, all edges in $L \times L$ are retired, bottom-up and right to left. Upon retirement of edge
$(u+i, u+j)$, $|E_{u+i} - R|$ equals $|E_{u+i} \cap L \times U| + n - (u+j)$, while $|E_{u+j} - R|$ equals $|E_{u+j} \cap L \times U|$,
giving a retiring cost of

$$\lceil i/2 \rceil + c - j + \lceil j/2 \rceil = c + \lceil i/2 \rceil - \lfloor j/2 \rfloor \le c,$$

since $i < j$.

Next, all edges $(i, u+j) \in U \times L$ are retired, in increasing order of $j$. Upon retirement of edge
$(i, u+j)$,

$$|E_i - R| = c - |\{k < j : (u+k, i) \notin E_{u+k}\}|
= c - \bigl(j - |\{k < j : (u+k, i) \in E_{u+k}\}|\bigr) \le c - \Bigl(j - \Bigl\lceil \frac{j^2}{4u} \Bigr\rceil\Bigr),$$

since the number of slots in the first $j$ bottom rows equals $(j-1) + (j-3) + \cdots = \lfloor j^2/4 \rfloor$, while $i$
appears once in every $u$ consecutive slots. Condition (2) implies

$$\frac{j}{2u} \le \frac{c-1}{2u} \le 1,$$

and hence

$$|E_i - R| \le c - \Bigl(j - \Bigl\lceil \Bigl\lfloor \frac{j}{2} \Bigr\rfloor \cdot \frac{j}{2u} \Bigr\rceil\Bigr)
\le c - \Bigl(j - \Bigl\lfloor \frac{j}{2} \Bigr\rfloor\Bigr) \le c - \Bigl\lceil \frac{j}{2} \Bigr\rceil.$$

Combined with $|E_{u+j} - R| \le \lceil j/2 \rceil$ we conclude $c(i, u+j) = |E_i - R| + |E_{u+j} - R| \le c$.

Next, all edges in HalfInTurn_u are retired in their usual order at maximum cost $u - 1$, which,
by condition (1), is bounded by $c$.

Finally, all edges in $L \times U$ are retired in arbitrary order, at costs no more than $\lfloor c/2 \rfloor$. □

3 Asynchronous Case for Two Agents 

In an asynchronous setting, one cannot rely on queries from different agents to be coordinated
in time. In some cases the agents will have no access to a clock; in other cases the clocks may
be subject to random fluctuations. In the asynchronous model, all an agent can control is which
other sites are queried, and in what order. We formalize an asynchronous mutual search (AMS)
algorithm as a partially ordered tournament in which the rows are totally ordered and edges from
different rows are unordered.^ The cost of an edge is defined as its position in the row-ordering
(querier cost) plus the length of the target row (queree cost), since it may happen that the queree
has already made all of its queries.
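
The asynchronous edge cost can be computed directly from the ordered rows. In the sketch below an AMS algorithm is a mapping from each site to its ordered list of targets; counting positions from 1 (so that the querier cost includes the contacting query) is an assumption of this illustration, as are the function names.

```python
def async_edge_cost(rows, i, j):
    """Position of j in row i (counted from 1) plus the full length of row j."""
    return rows[i].index(j) + 1 + len(rows[j])


def async_cost(rows):
    """Maximum asynchronous edge cost over all edges of the tournament."""
    return max(async_edge_cost(rows, i, j) for i in rows for j in rows[i])


# Rows of AllInTurn_4 and HalfInTurn_5, read as asynchronous algorithms.
all_in_turn_4 = {0: [1, 2, 3], 1: [2, 3], 2: [3], 3: []}
half_in_turn_5 = {i: [(i + 1) % 5, (i + 2) % 5] for i in range(5)}

print(async_cost(all_in_turn_4), async_cost(half_in_turn_5))
```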

Upper bound: With relatively little control over the ordering of queries, it seems even less
likely to find algorithms which improve on the intuitive bound of $n-1$ queries. For instance,
Lemma 3 no longer holds in the asynchronous case. But, surprisingly, a variation of SR_n, called
ASR_n, achieves about 1.5 times its cost. It is obtained by reversing within every row the order of
edges pointing to nodes in the lower group $L$ of Section 2.6. The example there now becomes:

^ There is a subtlety here. In the synchronous case, we allow only one of any two given sites to query the other
(unidirectional), reasoning that if both try to query the other, then one of those queries will always be made first. In
the asynchronous case however, there is no control over which query occurs first, and thus we need to allow for more
general, bidirectional algorithms (which we refrain from defining formally here). Although there may be possible
benefits to having two sites query each other, we have been unable to find ways of exploiting this. We conjecture that
for any bidirectional algorithm, there exists a unidirectional algorithm of the same or less cost. Since bidirectional
algorithms don't fit too well in the existing model, and since we lack nontrivial results regarding them, we use the
above unidirectional definition of AMS algorithm in the remainder of this section.

The key observation is that the shortest row has half the length of the maximum row and that
edges to nodes with shorter rows appear in the later positions. Using an analysis similar to that of
Theorem 2, one arrives at:

Theorem 3 Asynchronous algorithm ASR_n has cost at most $((5 - \sqrt{2})/4)n$ ($\approx 0.896n$).

Proof. We check that every one of the four types of edges has an asynchronous cost of at most
(note that $u = n - c$):

$$c + \frac{3}{4}u = \frac{3n + c}{4},$$

where $c = \lceil (2 - \sqrt{2})(n-1) \rceil$ as in Theorem 2. For some edges we use the fact that $c \le \frac{3}{2}u$, and
show that the cost is at most $c + \frac{1}{2}c$. Recall that row $u+j$ has length $c - 1 - \lfloor j/2 \rfloor$.
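Edge $(u+i, u+j) \in L \times L$ has asynchronous cost:

$$\lceil i/2 \rceil + (j - i) + c - 1 - \lfloor j/2 \rfloor \le c - 2 + \Bigl\lceil \frac{c+1}{2} \Bigr\rceil \le \frac{3}{2}c - 1.$$
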



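Edge $(i, u+j) \in U \times L$ has asynchronous cost:

$$\begin{aligned}
&\le \frac{u}{2} + j - |\{k < j : (u+k, i) \in E_{u+k}\}| + c - 1 - \lfloor j/2 \rfloor \\
&\le c - 1 + \lceil j/2 \rceil + \frac{u}{2} - \Bigl\lfloor \frac{j^2}{4u} \Bigr\rfloor \\
&\le c + \frac{j + u - j^2/2u}{2}.
\end{aligned}$$

Writing $j$ as $xu$ gives

$$j + u - j^2/2u = (x + 1 - x^2/2)u \le \frac{3}{2}u,$$

since $x + 1 - x^2/2$ assumes its maximum at $x = 1$. Hence

$$c + \frac{j + u - j^2/2u}{2} \le c + \frac{3}{4}u.$$
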

Edges in HalfInTurn_u ($U \times U$) have asynchronous cost at most:

$$\frac{u}{2} + c \le \frac{3}{2}c.$$

Finally, edges in $L \times U$ have cost at most:

$$\Bigl\lceil \frac{c-1}{2} \Bigr\rceil + c \le \frac{3}{2}c.$$

□




Lower bound: The $(4 - 2\sqrt{3})(n-1)$ lower bound on the synchronous case (Theorem 1) holds
a fortiori for the asynchronous case.

4 Randomized Case for Two Agents 

For a randomized MS protocol the worst-case expected cost is the worst case, over all agent locations, 
of the expected (over the random coin flips) number of queries. We can use randomization to 
obtain an algorithm for mutual search with expected complexity below the proven lower bound for 
deterministic algorithms, namely, a cost of n/2. 

Upper bound: Algorithm RandomHalfInConcert_n uses the same tournament as HalfInTurn_n,
but each agent randomizes the order of its queries, and the querying proceeds "in concert," in
rounds that give every row one turn for their next query. An example where the random choices
have already been made can be depicted as

    0:  2 1
    1:  2 3
    2:  3 4
    3:  4 0
    4:  0 1




Theorem 4 Algorithm RandomHalfInConcert_n has a worst-case expected cost of $\frac{n+1}{2}$ for odd $n$
and about $\lfloor \frac{n+1}{2} \rfloor$ for even $n$.

Proof. A worst case occurs when an agent located at node $n-1$ ends up querying the other
agent at node 0 (with the latter already having made a query in that round). The expected number
of queries is twice the number of queries the agent at $n-1$ makes in a uniformly random order of
the sites it queries, ending in, and including, the final successful query to site 0. This is about
$\lfloor \frac{n+1}{2} \rfloor$ for even $n$ and $\frac{n+1}{2}$ for odd $n$. □
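
The expectation in the proof can be computed exactly for small $n$. The sketch below assumes one particular reading of "in concert": in round $r$ the sites $0, \ldots, n-1$ take their $r$th query in site order, and the position of the other agent in the querier's random order is uniform over its row. The function names are hypothetical, and the check is restricted to odd $n$.

```python
from fractions import Fraction


def half_in_turn_rows(n):
    """rows[i] = targets of site i in the HalfInTurn_n tournament."""
    rows, seen = {i: [] for i in range(n)}, set()
    for i in range(n):
        for d in range(1, n // 2 + 1):
            j = (i + d) % n
            if frozenset((i, j)) not in seen:
                seen.add(frozenset((i, j)))
                rows[i].append(j)
    return rows


def expected_cost(rows, a, b):
    """Expected number of queries when the agent at a is the one that queries b."""
    la, lb = len(rows[a]), len(rows[b])
    total = Fraction(0)
    for r in range(1, la + 1):              # b is a's r-th query, each r equally likely
        turns_b = r if b < a else r - 1     # b moves before a within a round iff b < a
        total += r + min(lb, turns_b)
    return total / la


def worst_case_expected_cost(n):
    rows = half_in_turn_rows(n)
    return max(expected_cost(rows, a, b) for a in rows for b in rows[a])


for n in (3, 5, 7, 9, 11):
    print(n, worst_case_expected_cost(n))
```

Under these assumptions the worst case is attained by a site querying a lower-numbered site, for example the agent at $n-1$ querying site 0.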

Asynchronous Randomized Case: Allowing randomness in the algorithm, a ^ upper bound 
is obtained by a variation on RandomHalfInConcert„ in which each row is ordered randomly. This 
appears (but is not proven) to be the best one can do. The best lower bound we have is the
synchronous randomized $\frac{n-1}{8}$ lower bound below.

Lower bound: We prove a lower bound for the synchronous case (and hence for the asyn- 
chronous case). 

Theorem 5 For every randomized MS algorithm for two agents on $n$ sites using a bounded number
of coin flips the worst-case expected cost is at least $\frac{n-1}{8}$.

Proof. Fix a randomized MS algorithm for $n$ sites with $k = 2$ agents. We assume that every
agent uses at most a number of coin flips that is bounded by a total function of $n$. Assume that the
maximum of the expected number of queries is $c$, where the maximum is taken over all placements
of two agents on $n$ sites and the expectation is taken over the randomized coin flips. By Markov's
inequality, for every placement of the two agents there is probability at least $\frac{1}{2}$ that there are at
most $2c$ queries by both parties together up to contact. Hence, at least $\frac{1}{2}$ of all combinations of
agents' positions and sequences of used coin flips have cost at most $2c$.

Now consider a matrix where the rows correspond to the agents' positions and the columns to 
the pairs of sequences of coin flips. It is important for the remainder of the proof that two agents 
at i, j can use sequences of coin flips $\alpha$ and $\beta$ in two ways: the agent at i uses $\alpha$ and the agent at 
j uses $\beta$, and vice versa.^7 Therefore, there are $2\binom{n}{2}$ rows and (by the boundedness of the length 
of the coin flip sequence) a bounded number of columns. By the pigeonhole principle, at least one 
out of all coin flip sequence pairs (the columns) has at least $\frac{1}{2}$ of the $2\binom{n}{2}$ agents' positions (the row 
entries) incurring cost at most 2c. 

Fix any such column, say the one determined by coin flip sequences $\alpha$ and $\beta$. We define a 
"deterministic pseudo-MS" algorithm for two agents in 2n sites by splitting every original site i 
into two copies: a site $i_\alpha$ and a site $i_\beta$ ($0 \le i \le n-1$). Every new site $i_\gamma$ ($\gamma \in \{\alpha, \beta\}$) executes a 
new deterministic algorithm based on the coin flip sequence $\gamma$. This new algorithm is completely 
specified by the following: If the original algorithm specifies that an agent at site i using coin flip 
sequence $\alpha$ queries site j in time slot t, then the new algorithm specifies^8 that an agent at site $i_\alpha$ 
queries site $j_\beta$ in time slot t ($0 \le i \ne j \le n-1$, $t = 1, 2, \ldots$). The analogous rule holds with $\alpha$ and 
$\beta$ interchanged.^9 

Consider only the (at least) $\binom{n}{2}$ row entries (pairs of nodes) that have cost at most 2c and put 
an arc from node $i_\alpha$ to node $j_\beta$ if $i_\alpha$ queries $j_\beta$ in the new algorithm. This results in a directed 
bipartite graph on 2n nodes with at least $\binom{n}{2}$ arcs. The average number of outgoing arcs per node is 
at least $\binom{n}{2}/(2n) = \frac{n-1}{4}$. Fix a node with at least $\frac{n-1}{4}$ outgoing arcs, say $i_\alpha$, and consider the last node queried 
by $i_\alpha$, say node $j_\beta$. Then $i_\alpha$ queries $j_\beta$ but $j_\beta$ doesn't query $i_\alpha$. Hence for the pair $(i_\alpha, j_\beta)$ there 
are at least $\frac{n-1}{4}$ queries executed which, by assumption, is at most 2c. Therefore the expectation 
$c \ge \frac{n-1}{8}$. □ 
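
The counting step that closes the proof can be spelled out as follows. This is only a sketch of the arithmetic, assuming the fraction $\frac{1}{2}$ obtained from Markov's inequality and the $2\binom{n}{2}$ ordered placements (rows) of the matrix:

```latex
\begin{align*}
\#\text{arcs} &\ge \tfrac{1}{2} \cdot 2\binom{n}{2} = \frac{n(n-1)}{2}, \\
\text{average out-degree} &\ge \frac{n(n-1)/2}{2n} = \frac{n-1}{4}, \\
2c &\ge \frac{n-1}{4} \quad\Longrightarrow\quad c \ge \frac{n-1}{8}.
\end{align*}
```

Up to the additive constant this agrees with the 0.125n lower bound quoted in the abstract.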

5 Synchronous Case for Many Agents 

In the case of k > 2 agents we define the mutual search as before, but now the two agents involved 
in a query with an affirmative answer, as well as their nodes, "merge" into one, sharing all the 



^7 Otherwise it can happen that in the same column in agent positions (i, j) the agent at i uses sequence $\alpha$ and in 
the agent positions (i, k) with $k \ne j$ the agent at i uses sequence $\beta$ ($\beta \ne \alpha$). This would contradict our intended 
reduction to a deterministic algorithm. 

^8 In the actual random execution the pair of coin flip sequences in use in this fixed column is $\alpha$ and $\beta$. Therefore, 
if there are agents at site i and site j and the agent at site i uses coin flip sequence $\alpha$ then the agent at site j uses 
coin flip sequence $\beta$. 

^9 Note that $\alpha$-sites only query $\beta$-sites and the other way around. 
knowledge they acquired previously. A query of some node then becomes a query to the equivalence 
class of that node. In this view the goal of the problem is to merge all agents into one.^10 

In the two agent case, an agent has no input or "knowledge" other than the index of the site he 
is located at. In the multi-agent case we assume a "full information protocol" where every agent 
is an equivalence class whose knowledge comprises the complete timed querying and answering 
history of its constituent agents. Consequently, algorithms in the new setting have a vast scope 
for letting the querying behavior depend on all those details in case k > 2. Limiting the number of 
agents in the new setting to two reduces exactly to our old model.^11 

We now describe algorithm $\mathrm{RS}_{n,k}$ (for "RingSegments") for k agents. The algorithm has a 
cost below n for all $k = o(\sqrt{n})$. Algorithm $\mathrm{RS}_{n,k}$ splits the n-node search space into a "ring" R of 
$k(k-1)m$ nodes and a "left-over" group L of m nodes. For simplicity of description we assume 
that n is of the form $(k(k-1)+1)m$. 

The algorithm consists of two phases. During the first phase, agents residing on the ring engage 
in a sort of HalfInTurn, making $(k-1)m$ queries ahead in the ring. During the second phase, if not 
all the agents are completely joined yet, agents query all the left-over nodes. If, in the first phase, 
one agent queries a node affirmatively, then the agents merge and the merged agent continues where 
the front agent left off, adding up the number of remaining ring queries of both. The latter ensures 
that a collection of k' agents on the ring ends up querying $k'(k-1)m$ ring nodes, with no node 
queried twice. 

Theorem 6 Algorithm $\mathrm{RS}_{n,k}$ has cost $k(k-1)m$. 

Proof. Let k' be the number of actual agents residing on the ring. Consider first the case k' < k. 
Then 

$$c(\mathrm{RS}_{n,k}) = \underbrace{k'(k-1)m}_{\text{ring queries}} + \underbrace{k'm}_{\text{left-over queries}} \le (k-1)\,[(k-1)m + m] = (k-1)km .$$

Otherwise (k' = k), the agents find each other around the ring, making $(k-1)m$ queries each 
in the worst case. □ 
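
The case analysis in the proof is easy to check numerically. The following sketch only evaluates the cost expressions from the proof; it does not simulate the ring protocol, and the helper names `rs_cost` and `rs_worst_cost` are hypothetical.

```python
def rs_cost(k, m, agents_on_ring):
    """Cost expression from the proof of Theorem 6 for k' = agents_on_ring."""
    if not 0 <= agents_on_ring <= k:
        raise ValueError("agents_on_ring must lie between 0 and k")
    if agents_on_ring < k:
        ring_queries = agents_on_ring * (k - 1) * m   # k'(k-1)m
        leftover_queries = agents_on_ring * m         # k'm
        return ring_queries + leftover_queries
    # All k agents are on the ring: (k-1)m queries each.
    return k * (k - 1) * m


def rs_worst_cost(k, m):
    """Maximum of the cost expression over all possible k'."""
    return max(rs_cost(k, m, kp) for kp in range(k + 1))


k, m = 3, 2
n = (k * (k - 1) + 1) * m      # n = 14 sites
print(rs_worst_cost(k, m), n)  # 12 14
```

For k = 3 and m = 2 the worst case is 12 = k(k-1)m queries on n = 14 sites, so the cost stays below n as claimed.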



6 Conclusion 

The lower and upper bounds for the synchronous deterministic two agent case leave a small gap. 
We suspect Lemma |^ of being unnecessarily weak. It is tempting to try and prove a strengthened 
version claiming a length of no more than (c + k)/2 for the (k + 1)th shortest row, which would 
immediately imply the optimality of $\mathrm{SR}_n$. All algorithms we have looked at so far satisfy this 
condition. Unfortunately, there exist simple counterexamples, as witnessed by row distribution 
\y, where the upper half engages in a HalfInTurn algorithm before querying the lower half, which 
in turn engages in an AllInTurn algorithm (giving a saturated result). Such algorithms, however, 
have lots of relatively short rows, making them far from optimal. It seems reasonable to expect 
that an optimal algorithm has only a constant number of rows shorter than half the cost. In this 
light we pose the following conjecture as a lead on optimality of $\mathrm{SR}_n$: "Let T be an algorithm for 



^10 Of course, there are other possibilities to generalize the Mutual Search problem to k > 2 agents, in terms of how 
agents that have contacted one another coordinate the remainder of their mutual search. 

^11 We refrain from giving complicated formal definitions of a multi-player MS protocol and cost measure, which are 
not needed for the simple upper bound derived here. 
n sites with cost c, such that no row is shorter than [|J . Then the (k + 1)st shortest row of T has 
length at most (c + k)/2." 

The randomized and asynchronous two-agent cases leave large gaps between lower bound and 
upper bound. The multi-agent case is almost completely unexplored for all models. The same holds 
for bidirectional asynchronous algorithms as in footnote 6. 
A Related Work 

Distributed Match-Making. In "distributed match-making" [jl^ the set-up is similar to mutual 
search except that if an agent at node i queries a node k about an agent residing at node j and 
the latter agent has posted its whereabouts at node k, then the query to node k returns j [17, 13]. 
In general it is assumed that the search is in a structured database in the sense that there has 
been an initial set of queries from agents at all nodes to leave traces of their whereabouts at other 
nodes. This problem is basic to distributed mutual exclusion [15] and distributed name servers [17]. 
The difference is that distributed match-making operates in a cooperative structured environment 
while mutual search operates in a noncooperative unstructured environment. Some of our protocol 
representation ideas were inspired by this seminal paper. 

Tracking of Mobile Users. Another related search problem is the (on-line) tracking of a mobile 
user defined by Awerbuch and Peleg [|l|, g], where the goal is to access an object which can change 
location in the network. The mobile user moves among the nodes of the network. From time to 
time two types of requests are invoked at the nodes: move(i, j) (move the user from node i to node 
j) and find(i) (do a query from node i to the current location of the user). The overall goal is to 
minimize the communication cost. In contrast, our search problem is symmetric, and the agents 
are static. 

Distributed Tree Construction. The goal of MS can be thought of as forming a clique among 
the nodes at which the agents are located. In this sense the problem is related to tree construction 
problems, such as the (distributed) minimum-weight spanning tree (MST) [^ and Steiner tree |^. 
Besides other differences, MS is concerned with optimizing the process, and not the outcome, of the 
construction. 

Conspiracy Start-Up. Another possible application of MS is to secure multi-party computation. 
Fault-tolerant distributed computing and secure multi-party computation are concerned with n 
agents, a fraction (t) of which may be faulty. It is traditionally assumed [HI, |T3] that every faulty 
agent has complete knowledge of who and where all faulty agents are, and that they can collude and 
act in concert. We would like to weaken this assumption and investigate the complexity and cost 
of achieving such a perfect coordination. We consider this paper a first step towards the study 
of such spontaneous adversaries and coalition forming. In fact, many test-bed problems (Byzantine 
agreement [0]) and secure multi-party primitives (verifiable secret sharing ||6|) are bound to have 
interesting characterizations and efficient solutions under this new adversary. 






Probabilistic Coalition Formation. Billard and Pasquale |^] study the effect of communication 
environments on the level of knowledge concerning group, or coalition, formation in a 
distributed system. The motivation is the potential for improved performance of a group of agents 
depending on their ability to utilize shared resources. In this particular model the agents make 
randomized decisions regarding with whom to coordinate, and the payoffs are evaluated in different 
basic structures and amounts of communication (broadcast, master-slave, etc.). Their work has 
in turn been influenced by work on computational ecologies ||l^ and game theory studies \lt]. In 
contrast, ours is a search problem with the goal of minimizing the communication cost of achieving 
a perfect coalition. 



Search Theory. Finally, MS is also related to search theory and optimal search [12]. Search 
theory is generally concerned with locating an object in a set of n locations, given a "target 
distribution," which describes the probability of the object being at the different locations. In 
turn, optimal search involves computing how resources (like search time) can be allocated so as 
to maximize the probability of detection. Typically, it is assumed that the target distribution is 
known, although more recently this assumption has been relaxed [18]. Besides the multiple agent 
aspect, the setting of MS is more adversarial, as we measure worst-case cost. 